bugfix: Fix schema meta-info serialization issue in HBaseRelation #94

Merged
yuanoOo merged 2 commits into oceanbase:main from yuanoOo:fix-obkv
Jan 26, 2026
yuanoOo commented Jan 23, 2026

Summary

Problem: In a distributed Spark environment, the schema metadata (rowKey and columnFamilyMap) was stored in the HBaseRelation companion object. A Scala companion object is a per-JVM singleton, equivalent to static state: it is not serialized and shipped to executors along with the relation, so each executor JVM initialized these variables to their default (empty) values. As a result, flush() looked up the row key in the DataFrame schema using an empty field name and failed with java.lang.IllegalArgumentException: "" does not exist.
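The failure mode can be reproduced without Spark. The following is a minimal sketch using plain Java serialization, which Spark uses by default when shipping task closures; StaticRelation, InstanceRelation, and SerializationDemo are hypothetical names for illustration and are not classes from this repository. It checks whether the row key is actually present in the serialized bytes for each layout.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

// Hypothetical "before" layout: metadata lives in the companion object.
class StaticRelation extends Serializable
object StaticRelation {
  // Per-JVM state; it is never written when a StaticRelation instance is serialized.
  var rowKey: String = ""
}

// Hypothetical "after" layout: metadata lives on the instance (case classes are Serializable).
case class InstanceRelation(rowKey: String, columnFamilyMap: Map[String, String])

object SerializationDemo {
  // Serialize an object the way a closure's captured references would be serialized.
  def serialize(obj: AnyRef): Array[Byte] = {
    val buffer = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buffer)
    try out.writeObject(obj) finally out.close()
    buffer.toByteArray
  }

  def deserialize[T](bytes: Array[Byte]): T = {
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes))
    try in.readObject().asInstanceOf[T] finally in.close()
  }

  // ASCII strings appear byte-for-byte in the stream, so a substring check is enough here.
  def containsText(bytes: Array[Byte], text: String): Boolean =
    new String(bytes, "ISO-8859-1").contains(text)

  def main(args: Array[String]): Unit = {
    StaticRelation.rowKey = "user_id" // set on the "driver" only
    val staticBytes = serialize(new StaticRelation)
    val instanceBytes = serialize(InstanceRelation("user_id", Map("name" -> "cf1")))

    println(s"companion state in bytes: ${containsText(staticBytes, "user_id")}")
    println(s"instance state in bytes:  ${containsText(instanceBytes, "user_id")}")
    println(s"round-tripped rowKey:     ${deserialize[InstanceRelation](instanceBytes).rowKey}")
  }
}
```

Because the companion object's rowKey never enters the byte stream, a different JVM that deserializes the instance sees only whatever its own copy of the companion object was initialized to.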

Fix: Refactored HBaseRelation to ensure proper serialization of schema metadata:

Moved rowKey and columnFamilyMap from the companion object to immutable instance variables within the HBaseRelation class.
Updated parseCatalog to return a tuple containing the StructType, rowKey, and columnFamilyMap.
Updated the flush method to use these instance variables, ensuring that schema mapping information is correctly available on all executors.
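The shape of this refactor can be sketched in plain Scala. Everything below is an assumption made for illustration: Schema is a stand-in for Spark's StructType so the sketch runs without Spark on the classpath, the catalog is simplified to (column name, column family) pairs with a "rowkey" family marking the row key, and RelationSketch is not the actual HBaseRelation code from this PR.

```scala
// Stand-in for Spark's StructType; only the field lookup used by this sketch is modeled.
final case class Schema(fieldNames: Seq[String]) {
  def fieldIndex(name: String): Int = {
    val idx = fieldNames.indexOf(name)
    if (idx < 0) throw new IllegalArgumentException(s"$name does not exist")
    idx
  }
}

object CatalogParser {
  // Hypothetical catalog format: (column name, column family) pairs,
  // where the family "rowkey" marks the row key column.
  def parseCatalog(columns: Seq[(String, String)]): (Schema, String, Map[String, String]) = {
    val rowKey = columns
      .collectFirst { case (name, "rowkey") => name }
      .getOrElse(throw new IllegalArgumentException("catalog defines no row key column"))
    val columnFamilyMap = columns.collect {
      case (name, family) if family != "rowkey" => name -> family
    }.toMap
    // Return everything together instead of writing into shared companion-object state.
    (Schema(columns.map(_._1)), rowKey, columnFamilyMap)
  }
}

// The parsed metadata is stored in immutable instance fields, so it is
// serialized together with the relation when the relation is shipped to another JVM.
final class RelationSketch(columns: Seq[(String, String)]) extends Serializable {
  val (schema, rowKey, columnFamilyMap) = CatalogParser.parseCatalog(columns)

  // What a flush-like method would do first: resolve the row key position from instance state.
  def rowKeyIndex: Int = schema.fieldIndex(rowKey)
}
```

Returning a tuple keeps parseCatalog free of side effects, which is what allows the three values to be bound to vals in the class body rather than assigned to mutable shared variables.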

Fixes #91

yuanoOo merged commit 8f0f384 into oceanbase:main on Jan 26, 2026
10 checks passed
Development

Successfully merging this pull request may close these issues.

[Bug]: IllegalArgumentException caused by static state in HBaseRelation (Singleton object not serialized to Executors)

2 participants